Overview

Dataset statistics

Number of variables: 16
Number of observations: 381109
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 57.5 MiB
Average record size in memory: 158.2 B

Variable types

Numeric: 10
Categorical: 6

Warnings

Skewed: age_damage_premium is highly skewed (γ1 = 168.2687719)
Uniform: id is uniformly distributed
Unique: id has unique values
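
The γ1 in the skewness warning is the third standardized moment of a variable. As a rough sketch, assuming the plain moment estimator (the profiling tool may apply a bias correction, so its figure can differ slightly), it could be computed as below. The function name `skewness` and the sample lists are illustrative only and are not taken from the report.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Illustrative data, not drawn from the dataset
symmetric = [1, 2, 3, 4, 5]
long_right_tail = [1, 1, 1, 2, 2, 3, 50]

print(skewness(symmetric))        # zero for perfectly symmetric data
print(skewness(long_right_tail))  # positive: the tail extends to the right
```

A large positive value such as the one reported for age_damage_premium indicates that a few very large observations dominate the right tail.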

Reproduction

Analysis started: 2021-02-13 20:42:06.192227
Analysis finished: 2021-02-13 20:42:45.894614
Duration: 39.7 seconds
Software version: pandas-profiling v2.10.0
Download configuration: config.yaml

Variables

id
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 381109
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 190555
Minimum: 1
Maximum: 381109
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 19056.4
Q1: 95278
median: 190555
Q3: 285832
95-th percentile: 362053.6
Maximum: 381109
Range: 381108
Interquartile range (IQR): 190554

Descriptive statistics

Standard deviation: 110016.8362
Coefficient of variation (CV): 0.5773495117
Kurtosis: -1.2
Mean: 190555
Median Absolute Deviation (MAD): 95277
Skewness: 9.443273511 × 10^16
Sum: 7.26222255 × 10^10
Variance: 1.210370425 × 10^10
Monotonicity: Strictly increasing
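
Because id is strictly increasing with 381109 distinct values between 1 and 381109, it looks like the plain integer sequence 1..381109. Under that assumption (it is an inference, not something the report states), several of the figures above can be reproduced with the standard library. The sketch also assumes a sample variance (n - 1 denominator) and linearly interpolated quartiles.

```python
import math
import statistics

n = 381109
ids = list(range(1, n + 1))  # assumption: id is exactly 1..n

mean = sum(ids) / n
median = statistics.median(ids)

# Quartiles with linear interpolation between order statistics
q1, _, q3 = statistics.quantiles(ids, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: median of |x - median|
mad = statistics.median(abs(x - median) for x in ids)

# Sample variance (n - 1 denominator), standard deviation and CV
variance = math.fsum((x - mean) ** 2 for x in ids) / (n - 1)
std = math.sqrt(variance)
cv = std / mean

print(mean, median, q1, q3, iqr, mad)
print(std, cv)
```

Under these assumptions the mean, IQR, MAD, standard deviation and CV agree with the reported values to the precision shown.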
Histogram with fixed size bins (bins=50)
Common values

Value                  Count   Frequency (%)
2049                   1       < 0.1%
99738                  1       < 0.1%
19875                  1       < 0.1%
17826                  1       < 0.1%
23969                  1       < 0.1%
21920                  1       < 0.1%
109983                 1       < 0.1%
107934                 1       < 0.1%
114077                 1       < 0.1%
112028                 1       < 0.1%
Other values (381099)  381099  > 99.9%

Minimum 5 values

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
381109  1      < 0.1%
381108  1      < 0.1%
381107  1      < 0.1%
381106  1      < 0.1%
381105  1      < 0.1%

gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 MiB

Value  Count
0      206089
1      175020

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 381109
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Value  Count   Frequency (%)
0      206089  54.1%
1      175020  45.9%

Histogram of lengths of the category

Value  Count   Frequency (%)
0      206089  54.1%
1      175020  45.9%

Most occurring characters

Value  Count   Frequency (%)
0      206089  54.1%
1      175020  45.9%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  381109  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      206089  54.1%
1      175020  45.9%

Most occurring scripts

Value   Count   Frequency (%)
Common  381109  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      206089  54.1%
1      175020  45.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  381109  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      206089  54.1%
1      175020  45.9%

age
Real number (ℝ≥0)

Distinct: 66
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.82258357
Minimum: 20
Maximum: 85
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.9 MiB

Quantile statistics

Minimum: 20
5-th percentile: 21
Q1: 25
median: 36
Q3: 49
95-th percentile: 69
Maximum: 85
Range: 65
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 15.51161102
Coefficient of variation (CV): 0.3995512301
Kurtosis: -0.5656550665
Mean: 38.82258357
Median Absolute Deviation (MAD): 12
Skewness: 0.6725389977
Sum: 14795636
Variance: 240.6100764
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count   Frequency (%)
24                 25960   6.8%
23                 24256   6.4%
22                 20964   5.5%
25                 20636   5.4%
21                 16457   4.3%
26                 13535   3.6%
27                 10760   2.8%
28                 8974    2.4%
43                 8437    2.2%
44                 8357    2.2%
Other values (56)  222773  58.5%

Minimum 5 values

Value  Count  Frequency (%)
20     6232   1.6%
21     16457  4.3%
22     20964  5.5%
23     24256  6.4%
24     25960  6.8%

Maximum 5 values

Value  Count  Frequency (%)
85     11     < 0.1%
84     11     < 0.1%
83     22     < 0.1%
82     29     < 0.1%
81     56     < 0.1%

region_code
Real number (ℝ≥0)

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.3888074
Minimum: 0
Maximum: 52
Zeros: 2021
Zeros (%): 0.5%
Memory size: 13.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 15
median: 28
Q3: 35
95-th percentile: 47
Maximum: 52
Range: 52
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 13.22988803
Coefficient of variation (CV): 0.5013446733
Kurtosis: -0.8678571198
Mean: 26.3888074
Median Absolute Deviation (MAD): 10
Skewness: -0.1152664149
Sum: 10057012
Variance: 175.0299372
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count   Frequency (%)
28                 106415  27.9%
8                  33877   8.9%
46                 19749   5.2%
41                 18263   4.8%
15                 13308   3.5%
30                 12191   3.2%
29                 11065   2.9%
50                 10243   2.7%
3                  9251    2.4%
11                 9232    2.4%
Other values (43)  137515  36.1%

Minimum 5 values

Value  Count  Frequency (%)
0      2021   0.5%
1      1008   0.3%
2      4038   1.1%
3      9251   2.4%
4      1801   0.5%

Maximum 5 values

Value  Count  Frequency (%)
52     267    0.1%
51     183    < 0.1%
50     10243  2.7%
49     1832   0.5%
48     4681   1.2%

policy_sales_channel
Real number (ℝ≥0)

Distinct: 155
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.0342947
Minimum: 1
Maximum: 163
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 26
Q1: 29
median: 133
Q3: 152
95-th percentile: 160
Maximum: 163
Range: 162
Interquartile range (IQR): 123

Descriptive statistics

Standard deviation: 54.20399477
Coefficient of variation (CV): 0.4838160935
Kurtosis: -0.9708101781
Mean: 112.0342947
Median Absolute Deviation (MAD): 19
Skewness: -0.9000081235
Sum: 42697278
Variance: 2938.07305
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
152                 134784  35.4%
26                  79700   20.9%
124                 73995   19.4%
160                 21779   5.7%
156                 10661   2.8%
122                 9930    2.6%
157                 6684    1.8%
154                 5993    1.6%
151                 3885    1.0%
163                 2893    0.8%
Other values (145)  30805   8.1%

Minimum 5 values

Value  Count  Frequency (%)
1      1074   0.3%
2      4      < 0.1%
3      523    0.1%
4      509    0.1%
6      3      < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
163    2893   0.8%
160    21779  5.7%
159    51     < 0.1%
158    492    0.1%
157    6684   1.8%

driving_license
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 MiB

Value  Count
1      380297
0      812

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 381109
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Value  Count   Frequency (%)
1      380297  99.8%
0      812     0.2%

Histogram of lengths of the category

Value  Count   Frequency (%)
1      380297  99.8%
0      812     0.2%

Most occurring characters

Value  Count   Frequency (%)
1      380297  99.8%
0      812     0.2%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  381109  100.0%

Most frequent character per category

Value  Count   Frequency (%)
1      380297  99.8%
0      812     0.2%

Most occurring scripts

Value   Count   Frequency (%)
Common  381109  100.0%

Most frequent character per script

Value  Count   Frequency (%)
1      380297  99.8%
0      812     0.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  381109  100.0%

Most frequent character per block

Value  Count   Frequency (%)
1      380297  99.8%
0      812     0.2%

vehicle_age
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 MiB

Value  Count
1      200316
0      164786
2      16007

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 381109
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 2
4th row: 0
5th row: 0

Value  Count   Frequency (%)
1      200316  52.6%
0      164786  43.2%
2      16007   4.2%

Histogram of lengths of the category

Value  Count   Frequency (%)
1      200316  52.6%
0      164786  43.2%
2      16007   4.2%

Most occurring characters

Value  Count   Frequency (%)
1      200316  52.6%
0      164786  43.2%
2      16007   4.2%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  381109  100.0%

Most frequent character per category

Value  Count   Frequency (%)
1      200316  52.6%
0      164786  43.2%
2      16007   4.2%

Most occurring scripts

Value   Count   Frequency (%)
Common  381109  100.0%

Most frequent character per script

Value  Count   Frequency (%)
1      200316  52.6%
0      164786  43.2%
2      16007   4.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  381109  100.0%

Most frequent character per block

Value  Count   Frequency (%)
1      200316  52.6%
0      164786  43.2%
2      16007   4.2%

vehicle_damage
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 MiB

Value  Count
1      192413
0      188696

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 381109
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Value  Count   Frequency (%)
1      192413  50.5%
0      188696  49.5%

Histogram of lengths of the category

Value  Count   Frequency (%)
1      192413  50.5%
0      188696  49.5%

Most occurring characters

Value  Count   Frequency (%)
1      192413  50.5%
0      188696  49.5%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  381109  100.0%

Most frequent character per category

Value  Count   Frequency (%)
1      192413  50.5%
0      188696  49.5%

Most occurring scripts

Value   Count   Frequency (%)
Common  381109  100.0%

Most frequent character per script

Value  Count   Frequency (%)
1      192413  50.5%
0      188696  49.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  381109  100.0%

Most frequent character per block

Value  Count   Frequency (%)
1      192413  50.5%
0      188696  49.5%

previously_insured
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 MiB

Value  Count
0      206481
1      174628

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 381109
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 1

Value  Count   Frequency (%)
0      206481  54.2%
1      174628  45.8%

Histogram of lengths of the category

Value  Count   Frequency (%)
0      206481  54.2%
1      174628  45.8%

Most occurring characters

Value  Count   Frequency (%)
0      206481  54.2%
1      174628  45.8%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  381109  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      206481  54.2%
1      174628  45.8%

Most occurring scripts

Value   Count   Frequency (%)
Common  381109  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      206481  54.2%
1      174628  45.8%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  381109  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      206481  54.2%
1      174628  45.8%

annual_premium
Real number (ℝ≥0)

Distinct: 48838
Distinct (%): 12.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30564.38958
Minimum: 2630
Maximum: 540165
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.9 MiB

Quantile statistics

Minimum: 2630
5-th percentile: 2630
Q1: 24405
median: 31669
Q3: 39400
95-th percentile: 55176
Maximum: 540165
Range: 537535
Interquartile range (IQR): 14995

Descriptive statistics

Standard deviation: 17213.15506
Coefficient of variation (CV): 0.563176798
Kurtosis: 34.0045687
Mean: 30564.38958
Median Absolute Deviation (MAD): 7504
Skewness: 1.766087215
Sum: 1.164836395 × 10^10
Variance: 296292707
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
2630                  64877   17.0%
69856                 140     < 0.1%
39008                 41      < 0.1%
38287                 38      < 0.1%
45179                 38      < 0.1%
30117                 36      < 0.1%
43707                 36      < 0.1%
35074                 35      < 0.1%
36086                 35      < 0.1%
38452                 34      < 0.1%
Other values (48828)  315799  82.9%

Minimum 5 values

Value  Count  Frequency (%)
2630   64877  17.0%
6098   1      < 0.1%
7670   1      < 0.1%
8739   1      < 0.1%
9792   1      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
540165  4      < 0.1%
508073  1      < 0.1%
495106  1      < 0.1%
489663  1      < 0.1%
472042  3      < 0.1%

vintage
Real number (ℝ≥0)

Distinct: 290
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 154.3473967
Minimum: 10
Maximum: 299
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.9 MiB

Quantile statistics

Minimum: 10
5-th percentile: 24
Q1: 82
median: 154
Q3: 227
95-th percentile: 285
Maximum: 299
Range: 289
Interquartile range (IQR): 145

Descriptive statistics

Standard deviation: 83.67130363
Coefficient of variation (CV): 0.5420972781
Kurtosis: -1.200688042
Mean: 154.3473967
Median Absolute Deviation (MAD): 73
Skewness: 0.00302951689
Sum: 58823182
Variance: 7000.887051
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
256                 1418    0.4%
73                  1410    0.4%
282                 1397    0.4%
158                 1394    0.4%
187                 1392    0.4%
31                  1388    0.4%
160                 1388    0.4%
226                 1388    0.4%
131                 1387    0.4%
245                 1387    0.4%
Other values (280)  367160  96.3%

Minimum 5 values

Value  Count  Frequency (%)
10     1311   0.3%
11     1344   0.4%
12     1257   0.3%
13     1329   0.3%
14     1260   0.3%

Maximum 5 values

Value  Count  Frequency (%)
299    1283   0.3%
298    1384   0.4%
297    1284   0.3%
296    1302   0.3%
295    1275   0.3%

response
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 MiB

Value  Count
0      334399
1      46710

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 381109
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Value  Count   Frequency (%)
0      334399  87.7%
1      46710   12.3%

Histogram of lengths of the category

Value  Count   Frequency (%)
0      334399  87.7%
1      46710   12.3%

Most occurring characters

Value  Count   Frequency (%)
0      334399  87.7%
1      46710   12.3%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  381109  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      334399  87.7%
1      46710   12.3%

Most occurring scripts

Value   Count   Frequency (%)
Common  381109  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      334399  87.7%
1      46710   12.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  381109  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      334399  87.7%
1      46710   12.3%

age_damage
Real number (ℝ≥0)

Distinct: 66
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4340.278361
Minimum: 2
Maximum: 7324
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.9 MiB

Quantile statistics

Minimum: 2
5-th percentile: 1451
Q1: 2600
median: 4322
Q3: 5845
95-th percentile: 7141
Maximum: 7324
Range: 7322
Interquartile range (IQR): 3245

Descriptive statistics

Standard deviation: 1899.273687
Coefficient of variation (CV): 0.4375925986
Kurtosis: -1.202209861
Mean: 4340.278361
Median Absolute Deviation (MAD): 1559
Skewness: 0.06601177171
Sum: 1654119146
Variance: 3607240.537
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count   Frequency (%)
6659               25960   6.8%
6979               24256   6.4%
7141               20964   5.5%
4791               20636   5.4%
7324               16457   4.3%
2909               13535   3.6%
2408               10760   2.8%
2424               8974    2.4%
5845               8437    2.2%
5771               8357    2.2%
Other values (56)  222773  58.5%

Minimum 5 values

Value  Count  Frequency (%)
2      11     < 0.1%
7      11     < 0.1%
10     22     < 0.1%
13     29     < 0.1%
30     56     < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
7324   16457  4.3%
7141   20964  5.5%
6979   24256  6.4%
6659   25960  6.8%
5845   8437   2.2%

vintage_annual_premium
Real number (ℝ≥0)

Distinct: 302632
Distinct (%): 79.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 363.7404254
Minimum: 8.795986622
Maximum: 33717.28571
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.9 MiB

Quantile statistics

Minimum: 8.795986622
5-th percentile: 12.23255814
Q1: 117.3202847
median: 194.7523364
Q3: 374.1889764
95-th percentile: 1345.682989
Maximum: 33717.28571
Range: 33708.48973
Interquartile range (IQR): 256.8686917

Descriptive statistics

Standard deviation: 548.6531224
Coefficient of variation (CV): 1.508364438
Kurtosis: 107.7098648
Mean: 363.7404254
Median Absolute Deviation (MAD): 103.7178537
Skewness: 5.741464175
Sum: 138624749.8
Variance: 301020.2488
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count   Frequency (%)
15.20231214            261     0.1%
109.5833333            256     0.1%
9.460431655            256     0.1%
12.00913242            256     0.1%
87.66666667            256     0.1%
12.11981567            255     0.1%
17.53333333            255     0.1%
41.74603175            254     0.1%
9.392857143            254     0.1%
10.31372549            253     0.1%
Other values (302622)  378553  99.3%

Minimum 5 values

Value        Count  Frequency (%)
8.795986622  231    0.1%
8.825503356  235    0.1%
8.855218855  245    0.1%
8.885135135  217    0.1%
8.915254237  190    < 0.1%

Maximum 5 values

Value        Count  Frequency (%)
33717.28571  1      < 0.1%
25948.23077  1      < 0.1%
25876.53846  1      < 0.1%
22506.875    1      < 0.1%
21637.5      1      < 0.1%

age_vintage
Real number (ℝ≥0)

Distinct: 12101
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 168.5820243
Minimum: 24.41471572
Maximum: 2920
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.9 MiB

Quantile statistics

Minimum: 24.41471572
5-th percentile: 33.18181818
Q1: 56.47169811
median: 91.8707483
Q3: 172.8947368
95-th percentile: 589.6153846
Maximum: 2920
Range: 2895.585284
Interquartile range (IQR): 116.4230387

Descriptive statistics

Standard deviation: 230.4170123
Coefficient of variation (CV): 1.36679467
Kurtosis: 23.01507351
Mean: 168.5820243
Median Absolute Deviation (MAD): 44.77397411
Skewness: 4.110853617
Sum: 64248126.69
Variance: 53091.99956
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
365                   1366    0.4%
121.6666667           1346    0.4%
182.5                 1329    0.3%
91.25                 1258    0.3%
73                    1161    0.3%
60.83333333           994     0.3%
52.14285714           828     0.2%
146                   694     0.2%
730                   663     0.2%
243.3333333           657     0.2%
Other values (12091)  370813  97.3%

Minimum 5 values

Value        Count  Frequency (%)
24.41471572  23     < 0.1%
24.4966443   26     < 0.1%
24.57912458  26     < 0.1%
24.66216216  27     < 0.1%
24.74576271  23     < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
2920    5      < 0.1%
2883.5  4      < 0.1%
2847    5      < 0.1%
2810.5  6      < 0.1%
2774    4      < 0.1%

age_damage_premium
Real number (ℝ≥0)

SKEWED

Distinct: 271591
Distinct (%): 71.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.61946659
Minimum: 0.3590933916
Maximum: 27663
Zeros: 0
Zeros (%): 0.0%
Memory size: 13.9 MiB

Quantile statistics

Minimum: 0.3590933916
5-th percentile: 0.5166994106
Q1: 4.186395939
median: 6.994108473
Q3: 11.65038462
95-th percentile: 26.56232203
Maximum: 27663
Range: 27662.64091
Interquartile range (IQR): 7.463988676

Descriptive statistics

Standard deviation: 117.9591069
Coefficient of variation (CV): 11.10781845
Kurtosis: 33237.54538
Mean: 10.61946659
Median Absolute Deviation (MAD): 3.456939872
Skewness: 168.2687719
Sum: 4047174.292
Variance: 13914.35089
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count   Frequency (%)
0.3949541973           2629    0.7%
0.3768448202           2496    0.7%
0.5489459403           2250    0.6%
0.3682957569           2106    0.6%
0.9040907528           1852    0.5%
0.3590933916           1830    0.5%
1.092192691            1642    0.4%
0.4829232464           1624    0.4%
1.084983498            1603    0.4%
0.4499572284           1549    0.4%
Other values (271581)  361528  94.9%

Minimum 5 values

Value         Count  Frequency (%)
0.3590933916  1830   0.5%
0.3682957569  2106   0.6%
0.3768448202  2496   0.7%
0.3949541973  2629   0.7%
0.4499572284  1549   0.4%

Maximum 5 values

Value    Count  Frequency (%)
27663    1      < 0.1%
26120.5  1      < 0.1%
25939.5  1      < 0.1%
24683.5  1      < 0.1%
21944.5  1      < 0.1%

Interactions

[Figures: 90 interaction plots, not reproduced here]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
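
The calculation just described can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report generator runs; the function name `pearson_r` and the toy data are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel, so raw sums of products suffice.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
ys = [2 * x + 1 for x in xs]            # exact increasing linear relation
ys_shifted = [10 * y - 7 for y in ys]   # change of location and scale

print(pearson_r(xs, ys))                   # close to +1
print(pearson_r(xs, ys_shifted))           # unchanged by location and scale
print(pearson_r(xs, [-y for y in ys]))     # close to -1
```

The second call illustrates the invariance under changes in location and scale mentioned above.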

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
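
As a minimal sketch of that definition, the snippet below ranks both variables (ties receive the average of their positions, a common convention assumed here) and then applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but not linear

print(spearman_rho(xs, ys))  # close to 1: the relation is perfectly monotonic
print(pearson_r(xs, ys))     # below 1: the relation is not linear
```

The cubic example shows why ρ catches monotonic but nonlinear relationships that r understates.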

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
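
The pair-counting description corresponds to the variant often called tau-a. A direct, quadratic-time sketch is shown below; it ignores any tie correction, which a library implementation may apply, and the function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]  # one swapped pair out of six

print(kendall_tau_a(xs, xs))        # identical ordering
print(kendall_tau_a(xs, xs[::-1]))  # fully reversed ordering
print(kendall_tau_a(xs, ys))        # (5 - 1) / 6
```

Enumerating all pairs is only practical for small samples; it is used here because it mirrors the definition most directly.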

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses a bias-corrected measure proposed by Bergsma in 2013.
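
A minimal sketch of a bias-corrected Cramér's V along the lines of Bergsma's correction is given below. It works on a contingency table of observed counts and assumes every row and column total is non-zero; the function name `cramers_v_corrected` and the example tables are illustrative and not taken from the report.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


independent = [[10, 20], [20, 40]]  # rows are proportional: no association
perfect = [[50, 0], [0, 50]]        # each row determines the column

print(cramers_v_corrected(independent))  # 0.0
print(cramers_v_corrected(perfect))      # close to 1
```

The `max(0.0, ...)` clamp keeps the corrected statistic non-negative when the observed association is smaller than what chance alone would produce.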

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

index,id,gender,age,region_code,policy_sales_channel,driving_license,vehicle_age,vehicle_damage,previously_insured,annual_premium,vintage,response,age_damage,vintage_annual_premium,age_vintage,age_damage_premium
0,1,0,44,28.00000,26.00000,1,2,1,0,40454.00000,217,1,5771,186.42396,74.00922,7.00988
1,2,0,76,3.00000,26.00000,1,1,0,0,33536.00000,183,0,837,183.25683,151.58470,40.06691
2,3,0,47,28.00000,26.00000,1,2,1,0,38294.00000,27,1,5090,1418.29630,635.37037,7.52338
3,4,0,21,11.00000,152.00000,1,0,0,1,28619.00000,203,0,7324,140.98030,37.75862,3.90756
4,5,1,29,41.00000,152.00000,1,0,0,1,27496.00000,39,0,2504,705.02564,271.41026,10.98083
5,6,1,24,33.00000,160.00000,1,0,1,0,2630.00000,176,0,6659,14.94318,49.77273,0.39495
6,7,0,23,11.00000,152.00000,1,0,1,0,23367.00000,249,0,6979,93.84337,33.71486,3.34819
7,8,1,56,28.00000,26.00000,1,1,1,0,32031.00000,72,1,2763,444.87500,283.88889,11.59283
8,9,1,24,3.00000,152.00000,1,0,0,1,27619.00000,28,0,6659,986.39286,312.85714,4.14762
9,10,1,32,6.00000,152.00000,1,0,0,1,28771.00000,80,0,2560,359.63750,146.00000,11.23867

Last rows

index,id,gender,age,region_code,policy_sales_channel,driving_license,vehicle_age,vehicle_damage,previously_insured,annual_premium,vintage,response,age_damage,vintage_annual_premium,age_vintage,age_damage_premium
381099,381100,1,51,28.00000,26.00000,1,1,1,0,44504.00000,71,0,4077,626.81690,262.18310,10.91587
381100,381101,1,29,28.00000,124.00000,1,0,1,0,49007.00000,137,0,2504,357.71533,77.26277,19.57149
381101,381102,1,70,28.00000,122.00000,1,2,1,0,50904.00000,215,0,1398,236.76279,118.83721,36.41202
381102,381103,1,25,41.00000,152.00000,1,0,1,1,2630.00000,102,0,4791,25.78431,89.46078,0.54895
381103,381104,0,47,50.00000,26.00000,1,1,1,0,39831.00000,235,0,5090,169.49362,73.00000,7.82534
381104,381105,0,74,26.00000,26.00000,1,1,0,1,30170.00000,88,0,1084,342.84091,306.93182,27.83210
381105,381106,0,30,37.00000,152.00000,1,0,0,1,40016.00000,131,0,2511,305.46565,83.58779,15.93628
381106,381107,0,21,30.00000,160.00000,1,0,0,1,35118.00000,161,0,7324,218.12422,47.60870,4.79492
381107,381108,1,68,14.00000,124.00000,1,2,1,0,44617.00000,74,0,1451,602.93243,335.40541,30.74914
381108,381109,0,46,29.00000,26.00000,1,1,0,0,41777.00000,237,0,5471,176.27426,70.84388,7.63608
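
The report does not say how the last three columns were built. The sample rows do appear consistent with simple ratios, though: vintage_annual_premium ≈ annual_premium / vintage, age_vintage ≈ age × 365 / vintage, and age_damage_premium ≈ annual_premium / age_damage. These relations are inferred from the rows shown, not documented, so the sketch below should be read as a check of that hypothesis against the first sample row only.

```python
# First sample row, values copied from the "First rows" table
row = {
    "age": 44,
    "annual_premium": 40454.0,
    "vintage": 217,
    "age_damage": 5771,
    "vintage_annual_premium": 186.42396,
    "age_vintage": 74.00922,
    "age_damage_premium": 7.00988,
}

# Hypothesised formulas (an inference from the sample, not stated by the report)
derived = {
    "vintage_annual_premium": row["annual_premium"] / row["vintage"],
    "age_vintage": row["age"] * 365 / row["vintage"],
    "age_damage_premium": row["annual_premium"] / row["age_damage"],
}

for name, value in derived.items():
    print(name, round(value, 5), row[name])
```

If the hypothesis holds, each recomputed value matches the tabulated one to the five decimals displayed.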